AITopics | maximum entropy model

Collaborating Authors

maximum entropy model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Efficient first-order algorithms for large-scale, non-smooth maximum entropy models with application to wildfire science

Langlois, Gabriel P., Buch, Jatan, Darbon, Jérôme

arXiv.org Machine LearningMar-11-2024

Maximum entropy (Maxent) models are a class of statistical models that use the maximum entropy principle to estimate probability distributions from data. Due to the size of modern data sets, Maxent models need efficient optimization algorithms to scale well for big data applications. State-of-the-art algorithms for Maxent models, however, were not originally designed to handle big data sets; these algorithms either rely on technical devices that may yield unreliable numerical results, scale poorly, or require smoothness assumptions that many practical Maxent models lack. In this paper, we present novel optimization algorithms that overcome the shortcomings of state-of-the-art algorithms for training large-scale, non-smooth Maxent models. Our proposed first-order algorithms leverage the Kullback-Leibler divergence to train large-scale and non-smooth Maxent models efficiently. For Maxent models with discrete probability distribution of $n$ elements built from samples, each containing $m$ features, the stepsize parameters estimation and iterations in our algorithms scale on the order of $O(mn)$ operations and can be trivially parallelized. Moreover, the strong $\ell_{1}$ convexity of the Kullback--Leibler divergence allows for larger stepsize parameters, thereby speeding up the convergence rate of our algorithms. To illustrate the efficiency of our novel algorithms, we consider the problem of estimating probabilities of fire occurrences as a function of ecological features in the Western US MTBS-Interagency wildfire data set. Our numerical results show that our algorithms outperform the state of the arts by one order of magnitude and yield results that agree with physical models of wildfire occurrence and previous statistical analyses of wildfire drivers.

algorithm, maxent model, stepsize parameter, (16 more...)

arXiv.org Machine Learning

2403.06816

Country:

North America > United States > New York > New York County > New York City (0.28)
North America > United States > New York > Richmond County > New York City (0.04)
North America > United States > New York > Queens County > New York City (0.04)
(11 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Government > Regional Government > North America Government > United States Government (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.82)

Add feedback

Scattering Spectra Models for Physics

Cheng, Sihao, Morel, Rudy, Allys, Erwan, Ménard, Brice, Mallat, Stéphane

arXiv.org Artificial IntelligenceJun-29-2023

Physicists routinely need probabilistic models for a number of tasks such as parameter inference or the generation of new realizations of a field. Establishing such models for highly non-Gaussian fields is a challenge, especially when the number of samples is limited. In this paper, we introduce scattering spectra models for stationary fields and we show that they provide accurate and robust statistical descriptions of a wide range of fields encountered in physics. These models are based on covariances of scattering coefficients, i.e. wavelet decomposition of a field coupled with a point-wise modulus. After introducing useful dimension reductions taking advantage of the regularity of a field under rotation and scaling, we validate these models on various multi-scale physical fields and demonstrate that they reproduce standard statistics, including spatial moments up to 4th order. These scattering spectra provide us with a low-dimensional structured representation that captures key properties encountered in a wide range of physical fields. These generic models can be used for data exploration, classification, parameter inference, symmetry detection, and component separation.

artificial intelligence, data quality, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2306.1721

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Data Science > Data Quality > Data Transformation (0.31)

Add feedback

Scale Dependencies and Self-Similar Models with Wavelet Scattering Spectra

Morel, Rudy, Rochette, Gaspar, Leonarduzzi, Roberto, Bouchaud, Jean-Philippe, Mallat, Stéphane

arXiv.org Artificial IntelligenceJun-19-2023

We introduce the wavelet scattering spectra which provide non-Gaussian models of time-series having stationary increments. A complex wavelet transform computes signal variations at each scale. Dependencies across scales are captured by the joint correlation across time and scales of wavelet coefficients and their modulus. This correlation matrix is nearly diagonalized by a second wavelet transform, which defines the scattering spectra. We show that this vector of moments characterizes a wide range of non-Gaussian properties of multi-scale processes. We prove that self-similar processes have scattering spectra which are scale invariant. This property can be tested statistically on a single realization and defines a class of wide-sense self-similar processes. We build maximum entropy models conditioned by scattering spectra coefficients, and generate new time-series with a microcanonical sampling algorithm. Applications are shown for highly non-Gaussian financial and turbulence time-series.

artificial intelligence, data quality, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2204.10177

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Banking & Finance (1.00)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

How biased are maximum entropy models?

Neural Information Processing SystemsApr-6-2023, 13:13:28 GMT

Maximum entropy models have become popular statistical models in neuroscience and other areas in biology, and can be useful tools for obtaining estimates of mu- tual information in biological systems. However, maximum entropy models fit to small data sets can be subject to sampling bias; i.e. the true entropy of the data can be severely underestimated. Here we study the sampling properties of estimates of the entropy obtained from maximum entropy models. We show that if the data is generated by a distribution that lies in the model class, the bias is equal to the number of parameters divided by twice the number of observations. However, in practice, the true distribution is usually outside the model class, and we show here that this misspecification can lead to much larger bias. We provide a perturba- tive approximation of the maximally expected bias when the true model is out of model class, and we illustrate our results using numerical simulations of an Ising model; i.e. the second-order maximum entropy distribution on binary data.

maximum entropy model, model class

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

How biased are maximum entropy models?

Macke, Jakob H., Murray, Iain, Latham, Peter E.

Neural Information Processing SystemsFeb-14-2020, 23:27:55 GMT

maximum entropy model, model class

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Maximum Entropy Models from Phase Harmonic Covariances

Zhang, Sixin, Mallat, Stéphane

arXiv.org Machine LearningNov-22-2019

Maximum Entropy Models from Phase Harmonic Covariances Sixin Zhang 1, 4, St ephane Mallat 1, 2,3 1 ENS, PSL University, Paris, France 2 Coll ege de France, Paris, France 3 Flatiron Institute, New York, USA 4 Center for Data Science, Peking University, Beijing, China November 25, 2019 Abstract We define maximum entropy models of non-Gaussian stationary random vectors from covariances of nonlinear representations. These representations are calculated by multiplying the phase of Fourier or wavelet coefficients with harmonic integers, which amounts to compute a windowed Fourier transform along their phase. Rectifiers in neural networks compute such phase windowing. The covariance of these harmonic coefficients capture dependencies of Fourier and wavelet coefficients across frequencies, by canceling their random phase. We introduce maximum entropy models conditioned by such covariances over a graph of local interactions. These models are approximated by transporting an initial maximum ...

coefficient, covariance, wavelet coefficient, (13 more...)

arXiv.org Machine Learning

1911.10017

Country:

Europe > France > Île-de-France > Paris > Paris (0.44)
Asia > China > Beijing > Beijing (0.24)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)

Add feedback

Part-of-Speech Tagging

#artificialintelligenceSep-8-2019, 16:34:29 GMT

Rule-Based: A dictionary is constructed with possible tags for each word. Rules are either hand-crafted, learned or both. An example rule might say, "If an ambiguous/unknown word X is preceded by a determiner and followed by a noun, tag it as an adjective." Statistical: A text corpus is used to derive useful probabilities. Given a sequence of words, the most probable sequence of tags is selected.

artificial intelligence, machine learning, stochastic method, (14 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.43)

Add feedback

An Annotated Corpus for Machine Reading of Instructions in Wet Lab Protocols

Kulkarni, Chaitanya, Xu, Wei, Ritter, Alan, Machiraju, Raghu

arXiv.org Artificial IntelligenceMay-1-2018

We describe an effort to annotate a corpus of natural language instructions consisting of 622 wet lab protocols to facilitate automatic or semi-automatic conversion of protocols into a machine-readable format and benefit biological research. Experimental results demonstrate the utility of our corpus for developing machine learning approaches to shallow semantic parsing of instructional texts. We make our annotated Wet Lab Protocol Corpus available to the research community.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

1805.00195

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Maximum entropy models capture melodic styles

Sakellariou, Jason, Tria, Francesca, Loreto, Vittorio, Pachet, François

arXiv.org Machine LearningOct-11-2016

We introduce a Maximum Entropy model able to capture the statistics of melodies in music. The model can be used to generate new melodies that emulate the style of the musical corpus which was used to train it. Instead of using the $n-$body interactions of $(n-1)-$order Markov models, traditionally used in automatic music generation, we use a $k-$nearest neighbour model with pairwise interactions only. In that way, we keep the number of parameters low and avoid over-fitting problems typical of Markov models. We show that long-range musical phrases don't need to be explicitly enforced using high-order Markov interactions, but can instead emerge from multiple, competing, pairwise interactions. We validate our Maximum Entropy model by contrasting how much the generated sequences capture the style of the original corpus without plagiarizing it. To this end we use a data-compression approach to discriminate the levels of borrowing and innovation featured by the artificial sequences. The results show that our modelling scheme outperforms both fixed-order and variable-order Markov models. This shows that, despite being based only on pairwise interactions, this Maximum Entropy scheme opens the possibility to generate musically sensible alterations of the original phrases, providing a way to generate innovation.

artificial intelligence, machine learning, sequence, (18 more...)

arXiv.org Machine Learning

1610.03414

Country: Europe > France (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)

Add feedback

Can we use gradient desent method in maximum entropy model?

#artificialintelligenceJul-14-2016, 08:30:37 GMT

I see a lot of implementations use GIS or IIS to train the maximum entropy model. Can we use gradient desent method? If we can use it, why most tutorial directly tell GIS or IIS methos, but do not show the simple gradient desent method to train maximum entropy model? As we know, softmax regression is equivalent to the maxent model, but I never heard GIS or IIS in softmax. Is there a toy code use simple gradient desent method to train maxent model?

artificial intelligence, gradient desent method, machine learning, (4 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (0.99)

Add feedback